Remove clusters 8 and 9, which are not important with respect to AT1 or AT2 markers. Cluster 0 is kept this time.
Recompute the Louvain clustering.
Likely assignment: clusters 0, 2, 4, 7 are AT1 and clusters 1, 3, 5, 6 are AT2.
The velocity projections are embedded on the Louvain clusters. Let's describe the direction of the arrows in terms of clusters.
Compute terminal states (root and end points).
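The end points and root cells are obtained as stationary states of the velocity-inferred transition matrix and its transpose, respectively. They are given by the left eigenvectors corresponding to an eigenvalue of 1, i.e. $\mu^{\text{end}} = \mu^{\text{end}} \pi$ and $\mu^{\text{root}} = \mu^{\text{root}} \pi^{T}$, where $\pi$ denotes the transition matrix.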
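The stationary-state idea can be illustrated on a toy transition matrix. The sketch below is not scVelo's implementation: it uses plain power iteration on a hand-written 3-cell chain, and it re-normalizes the rows of the transposed matrix so that it is again a valid transition matrix (an assumption made for this example). The names `row_normalize`, `stationary_state`, `P` and `P_back` are hypothetical.

```python
def row_normalize(matrix):
    """Scale each row so it sums to 1 (all-zero rows are left unchanged)."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([x / total for x in row] if total > 0 else list(row))
    return out


def stationary_state(P, n_iter=500):
    """Approximate the left eigenvector of P for eigenvalue 1 via mu <- mu P."""
    n = len(P)
    mu = [1.0 / n] * n  # start from a uniform distribution over cells
    for _ in range(n_iter):
        mu = [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]
        total = sum(mu)
        mu = [m / total for m in mu]  # keep mu a probability distribution
    return mu


# Toy chain of three cells: 0 -> 1 -> 2, where cell 2 only transitions to itself.
P = [
    [0.1, 0.9, 0.0],
    [0.0, 0.1, 0.9],
    [0.0, 0.0, 1.0],
]

# Backward matrix: transpose, then re-normalize rows (assumption of this sketch).
P_back = row_normalize([list(col) for col in zip(*P)])

mu_end = stationary_state(P)        # mass accumulates at the end point
mu_root = stationary_state(P_back)  # mass accumulates at the root

end_cell = max(range(len(P)), key=lambda i: mu_end[i])
root_cell = max(range(len(P)), key=lambda i: mu_root[i])
print("end cell:", end_cell, "root cell:", root_cell)
```

In this toy chain the forward walk concentrates on the last cell and the backward walk on the first one, which is the intuition behind reading end points and roots off the two stationary distributions.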
Velocity projection by clusters
The most fine-grained resolution of the velocity vector field is at the single-cell level, with each arrow showing the direction and speed of movement of an individual cell.
The AD10 sample, i.e. the control sample at DAY 10 (yellow arrows), clearly shows two separate velocity directions. This means the sample contains two cell populations moving in two different directions.
Do not limit biological conclusions to the projected velocities; examine individual gene dynamics via phase portraits to understand how the inferred directions are supported by particular genes.
How to interpret a spliced vs. unspliced phase portrait: gene activity is orchestrated by transcriptional regulation. Transcriptional induction for a particular gene results in an increase of (newly transcribed) precursor unspliced mRNAs while, conversely, repression or absence of transcription results in a decrease of unspliced mRNAs. Spliced mRNAs are produced from unspliced mRNAs and follow the same trend with a time lag. Time is a hidden/latent variable. Thus, the dynamics need to be inferred from what is actually measured: spliced and unspliced mRNAs as displayed in the phase portrait.
Examine the phase portraits of some marker genes, visualized with scv.pl.velocity(adata, gene_names) or scv.pl.scatter(adata, gene_names).
It looks like the previously known stem cell genes do not explain the velocity.
The black line corresponds to the estimated ‘steady-state’ ratio, i.e. the ratio of unspliced to spliced mRNA abundance which is in a constant transcriptional state. RNA velocity for a particular gene is determined as the residual, i.e. how much an observation deviates from that steady-state line.
Positive velocity indicates that a gene is up-regulated, which occurs for cells that show higher abundance of unspliced mRNA for that gene than expected in steady state. Conversely, negative velocity indicates that a gene is down-regulated.
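The residual idea can be made concrete with a few numbers. The sketch below is a simplified illustration, not scVelo's estimator: it assumes the steady-state ratio is fitted by ordinary least squares through the origin over all cells, and the spliced/unspliced values are invented for the example. The function names are hypothetical.

```python
def steady_state_ratio(spliced, unspliced):
    """Least-squares slope through the origin for unspliced ~ gamma * spliced."""
    numerator = sum(u * s for u, s in zip(unspliced, spliced))
    denominator = sum(s * s for s in spliced)
    return numerator / denominator


def gene_velocity(spliced, unspliced, gamma):
    """Velocity as the residual of each cell from the steady-state line."""
    return [u - gamma * s for u, s in zip(unspliced, spliced)]


# Invented abundances of one gene in four cells.
spliced = [1.0, 2.0, 3.0, 4.0]
unspliced = [1.0, 1.0, 1.0, 3.0]

gamma = steady_state_ratio(spliced, unspliced)
velocity = gene_velocity(spliced, unspliced, gamma)

# Cells above the line (more unspliced than expected) are being up-regulated.
trend = ["up" if v > 0 else "down" if v < 0 else "steady" for v in velocity]
for s, u, v, t in zip(spliced, unspliced, velocity, trend):
    print(f"s={s:.1f} u={u:.1f} velocity={v:+.2f} -> {t}")
```

Cells lying above the fitted line get a positive velocity and cells below it a negative one, matching the reading of the phase portrait described above.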
Find genes that explain the directionality among the up-regulated and down-regulated genes.
The phase portrait for the Fn1 gene explains the velocity nicely.
Now let's look at some of the genes that came out on top of the list in all four clusters.
Expression of S100a6 looks interesting: it has cells with high unspliced abundance, and it falls in the middle of DAY 7.
Look at cluster 2 and cluster 5, which show Trf and Krt7 expression in all cells.
We need a systematic way to identify genes that may help explain the resulting vector field and inferred lineages. To do so, we can test which genes have cluster-specific differential velocity expression, being significantly higher/lower compared to the remaining population. We will run a differential velocity t-test, which outputs a gene ranking for each cluster.
# Shared plotting options; add_outline lists the clusters to outline.
kwargs = dict(frameon=False, size=10, linewidth=1.5, add_outline='0, 1, 2, 3, 4, 5, 6, 7, 8')
# Phase portraits of the five top-ranked velocity genes for cluster '0'.
scv.pl.scatter(cdata_subset, df['0'][:5], ylabel='0', **kwargs)
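To show what a cluster-versus-rest velocity test does, here is a minimal sketch in plain Python. It assumes Welch's t-statistic as the test and uses invented velocities for three hypothetical genes (`geneA`, `geneB`, `geneC`); it is an illustration of the ranking idea, not the library's code.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group, rest):
    """Welch's t-statistic comparing the mean velocity of group vs. rest."""
    standard_error = sqrt(variance(group) / len(group) + variance(rest) / len(rest))
    return (mean(group) - mean(rest)) / standard_error


def rank_genes(velocities, labels, cluster):
    """Rank genes by how much higher their velocity is inside the cluster."""
    scores = {}
    for gene, values in velocities.items():
        inside = [v for v, lab in zip(values, labels) if lab == cluster]
        outside = [v for v, lab in zip(values, labels) if lab != cluster]
        scores[gene] = welch_t(inside, outside)
    return sorted(scores, key=scores.get, reverse=True), scores


# Invented per-cell velocities; the first three cells belong to cluster '0'.
labels = ["0", "0", "0", "1", "1", "1"]
velocities = {
    "geneA": [1.0, 1.2, 0.8, -0.1, 0.0, 0.1],   # higher velocity in cluster '0'
    "geneB": [0.1, -0.1, 0.0, 0.1, 0.0, -0.1],  # no difference
    "geneC": [-1.0, -0.8, -1.2, 0.0, 0.1, -0.1],  # lower velocity in cluster '0'
}

ranking, scores = rank_genes(velocities, labels, cluster="0")
print(ranking)
```

Genes at the top of such a ranking have velocities that are specifically elevated in the cluster, which is why their phase portraits are the natural ones to inspect first.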
Two more useful stats:
- The speed or rate of differentiation is given by the length of the velocity vector.
- The coherence of the vector field (i.e., how a velocity vector correlates with its neighboring velocities) provides a measure of confidence.
These provide insights into where cells differentiate at a slower/faster pace, and where the direction is determined or undetermined.
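Both statistics are simple to compute by hand on a toy example. The sketch below assumes the Euclidean norm for speed and the mean Pearson correlation with neighboring velocity vectors for coherence; the three cells, their velocity vectors and the neighbor lists are invented, and the function names are hypothetical.

```python
from math import sqrt


def speed(vector):
    """Length (Euclidean norm) of a velocity vector."""
    return sqrt(sum(x * x for x in vector))


def pearson(a, b):
    """Pearson correlation of two equally long vectors (0.0 if one is constant)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return sum(x * y for x, y in zip(da, db)) / denom if denom > 0 else 0.0


def coherence(cell, velocities, neighbors):
    """Mean correlation of a cell's velocity with its neighbors' velocities."""
    corrs = [pearson(velocities[cell], velocities[j]) for j in neighbors[cell]]
    return sum(corrs) / len(corrs)


# Invented velocity vectors over three genes.
velocities = {
    0: [3.0, 0.0, 4.0],
    1: [6.0, 0.0, 8.0],    # same direction as cell 0, twice as fast
    2: [-3.0, 0.0, -4.0],  # opposite direction
}
neighbors = {0: [1], 1: [0], 2: [0, 1]}

speeds = {cell: speed(v) for cell, v in velocities.items()}
confidence = {cell: coherence(cell, velocities, neighbors) for cell in velocities}
print(speeds)
print(confidence)
```

Cell 1 moves faster than cell 0 but in the same direction, so both get high coherence, while cell 2 points against its neighbors and gets a low value.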
We can visualize the velocity graph to portray all velocity-inferred cell-to-cell connections/transitions.
The graph can be used to draw the descendants/ancestors of a specified cell. Highlighted cells can be traced to their potential fates.
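As a rough illustration of tracing descendants and ancestors, the sketch below treats the velocity graph as a plain directed graph and collects every cell reachable from a start cell. This is only reachability on an invented five-cell graph, not the library's transition sampling; `reachable` and `reverse` are hypothetical helpers.

```python
def reachable(graph, start):
    """All nodes that can be reached from start by following directed edges."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    seen.discard(start)  # a cell is not its own descendant
    return seen


def reverse(graph):
    """Flip every edge so that reachability yields ancestors instead."""
    rev = {node: [] for node in graph}
    for node, targets in graph.items():
        for target in targets:
            rev.setdefault(target, []).append(node)
    return rev


# Invented directed graph: cell -> cells it is likely to transition to.
graph = {0: [1], 1: [2, 3], 2: [], 3: [4], 4: []}

descendants = reachable(graph, 1)
ancestors = reachable(reverse(graph), 4)
print("descendants of cell 1:", sorted(descendants))
print("ancestors of cell 4:", sorted(ancestors))
```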
Finally, based on the velocity graph, a velocity pseudotime can be computed. After inferring a distribution over root cells from the graph, it measures the average number of steps it takes to reach a cell after walking along the graph starting from the root cells.
In contrast to diffusion pseudotime, it implicitly infers the root cells and is based on the directed velocity graph instead of the similarity-based diffusion kernel.
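The "average number of steps from the root cells" can be mimicked on a toy graph. The sketch below is a deliberately crude stand-in for the actual computation: it assumes shortest-path step counts (breadth-first search) from each root, averages them with the root weights, and rescales to [0, 1]. The graph, the root weights and the function names are invented for illustration.

```python
from collections import deque


def steps_from(graph, start):
    """Number of steps needed to reach each cell from start (breadth-first)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def toy_pseudotime(graph, root_weights):
    """Root-weighted average step count per cell, rescaled to [0, 1]."""
    per_root = {root: steps_from(graph, root) for root in root_weights}
    average = {}
    for cell in graph:
        hits = [(w, per_root[r][cell]) for r, w in root_weights.items()
                if cell in per_root[r]]
        if hits:  # cells unreachable from every root get no pseudotime
            total_weight = sum(w for w, _ in hits)
            average[cell] = sum(w * d for w, d in hits) / total_weight
    top = max(average.values()) or 1.0
    return {cell: d / top for cell, d in average.items()}


# Invented directed graph with a single root cell (cell 0).
graph = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}
pseudotime = toy_pseudotime(graph, root_weights={0: 1.0})
print(pseudotime)
```

Because the walk follows directed edges, cells downstream of the root receive later pseudotimes without the root having to be chosen by hand once a root distribution is available.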
IGFBP7 and Napsa are clearly markers for cells in transition.
STOP HERE AND DO NOT RUN BELOW. YOU CAN LOAD THE SAVED DATA INSTEAD.